[IR Container] Phase 2.3 Basic shared ptr#5960
[IR Container] Phase 2.3 Basic shared ptr#5960mdavis36 wants to merge 2 commits intomd/fusion-stmt-regfrom
Conversation
|
!test |
|
Review updated until commit fab1748 Description
|
| Relevant files | |||||||||
|---|---|---|---|---|---|---|---|---|---|
| Enhancement |
| ||||||||
| Configuration changes |
|
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| ⚡ Recommended focus areas for review |
Thread Safety
sharing_fusions_ set is modified by addFusion, removeFusion, and transferFusion without any synchronization. When parallel compilation is re-enabled in #5971, concurrent access to these methods from multiple threads could cause race conditions. Consider if std::shared_mutex or other synchronization is needed. |
3a199c8 to
53e5045
Compare
074c814 to
54cd0fe
Compare
53e5045 to
f8ff364
Compare
|
!test |
Greptile SummaryThis PR is the core Phase 2.3 step that converts Key observations:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant FA as Fusion A
participant CA as IrContainer A
participant FB as Fusion B
participant CB as IrContainer B
note over FA,CA: Phase 2.3 construction
FA->>CA: make_shared IrContainer
FA->>CA: addFusion(A) → sharing_fusions={A}
FB->>CB: make_shared IrContainer
FB->>CB: addFusion(B) → sharing_fusions={B}
note over FA,CB: Fusion::swap(A, B)
FA->>CA: IrContainer::swap swaps vals and exprs contents
FA->>CA: update stmt->ir_container_ = A
FB->>CB: update stmt->ir_container_ = B
note over CA,CB: sharing_fusions unchanged — each container still tracks its original Fusion
note over FA,CA: Destruction of Fusion A
FA->>CA: clear() destroys vals_up_ and exprs_up_
FA->>CA: removeFusion(A) → sharing_fusions={}
CA-->>FA: shared_ptr refcount drops to 0 IrContainer A destroyed
note over CA,CB: Future risk — ir_container_ptr() can let container outlive Fusion leaving stmt->ir_container_ dangling
Last reviewed commit: dcc99d7 |
Additional Comments (1)
These comments reference "parent backpointers" and say "each Fusion owns a different IrContainer", both of which are artifacts of the old |
54cd0fe to
0a16eaa
Compare
c965408 to
5339bf5
Compare
Additional Comments (4)
class IrContainerFusionPasskey {
friend class Fusion;
private:
explicit IrContainerFusionPasskey() = default;
};
// Then:
void addFusion(IrContainerFusionPasskey, Fusion* fusion);
void removeFusion(IrContainerFusionPasskey, Fusion* fusion);
void transferFusion(IrContainerFusionPasskey, Fusion* from, Fusion* to);At minimum,
|
0a16eaa to
251ae68
Compare
5339bf5 to
24e4f69
Compare
Additional Comments (4)
Consider adding a debug/NVF_ERROR assertion that
A comment explaining why
With The intent is documented in the PR and the TODO comment, but it's worth ensuring CI tests that exercise parallel compilation explicitly account for this temporary change in behavior.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time! |
|
|
||
| auto ir_cloner = IrContainer::copy(from->ir_container(), to->ir_container()); | ||
| auto ir_cloner = | ||
| IrContainer::copy(from->ir_container(), to->ir_container(), to); |
There was a problem hiding this comment.
This additional argument is only temporary while IrContainer::copy is a valid call from Fusion.
|
!test |
24e4f69 to
fab1748
Compare
Additional Comments (3)
Since
The move constructor delegates to the default constructor first (which calls
When However, the Concretely: The While this dangling entry lives only until |
7d1aa72 to
0716dc5
Compare
Change Fusion::ir_container_ from unique_ptr to shared_ptr to enable future container sharing between Fusions. Add Fusion tracking API to IrContainer (addFusion/removeFusion/transferFusion/sharingCount). Remove IrContainer::parent_ since the 1:1 relationship no longer holds. Disable parallel compilation during the shared_ptr transition.
fab1748 to
dcc99d7
Compare
Additional Comments (4)
The safer ordering would be to call Fusion::~Fusion() {
if (ir_container_) {
ir_container_->removeFusion(this);
// Only clear the backing store when this was the last owner.
if (ir_container_->sharingCount() == 0) {
clear();
}
}
}As written, the invariant "a container is cleared only when no Fusion still needs it" is violated the moment actual sharing is switched on.
The new
The root cause is that
Because Consider wrapping the parallel path in an if (num_groups == 1 ||
isOptionDisabled(DisableOption::ParallelCompile)) {
compileKernel(group_runtime_inputs, group_to_run);
#ifndef PHASE2_DISABLE_PARALLEL_COMPILE
} else {
// launch compileKernel thread here
...
}
#endifAlternatively, a single
This accessor returns the raw A short doc-comment spelling out the contract would prevent misuse: /// Returns the shared_ptr to the underlying IrContainer.
/// IMPORTANT: Any Fusion that takes ownership of this container
/// must call `ir_container_->addFusion(new_fusion)` and, on
/// destruction, `ir_container_->removeFusion(new_fusion)` to keep
/// the sharing_fusions_ tracking accurate.
std::shared_ptr<IrContainer> ir_container_ptr() const {
return ir_container_;
} |
|
!test |
Summary
Replace
unique_ptr<IrContainer>withshared_ptr<IrContainer>in Fusion and add the container-side registration API that tracks which Fusions share a given container.This is the foundational change of Phase 2. The pointer type change alone has no behavioral impact — single-Fusion containers behave identically under
shared_ptr. The tracking infrastructure (sharing_fusions_,addFusion/removeFusion) lays the groundwork for all subsequent tasks.Parallel compilation is disabled in-code via
kPhase2DisableParallelCompile = trueas a precaution during the transition. This ensures CI runs serial compilation for later PRs without requiring environment variables. Parallel compilation is restored in #5971 .Relationship to Phase 2
This is the core type change that enables the shared container model:
Without this change, no subsequent Phase 2 work (per-Fusion tracking, shared-container copy/move, thread safety) is possible. The tracking API is the mechanism that makes container sharing safe — it allows the container to know its consumers, enabling correct cleanup when Fusions are destroyed.
CI Risk
Low. For single-Fusion containers (all existing usage),
shared_ptris behaviorally identical tounique_ptr. No accessor paths change. Parallel compilation is serialized by the in-code constant.